Error Analysis of English-Chinese Machine Translation

Authors

  • Fei Fang
  • Shi-Li Ge
  • Rou Song
Abstract

To find a practical way of improving machine translation (MT) quality, the error types and their distribution in MT output must first be analyzed. This paper analyzes English-Chinese MT errors from the perspective of the naming-telling clause (NT clause, hereafter). MT output was obtained from two types of input: in the first, whole original English sentences were fed into an MT engine; in the second, the English sentences were parsed into English NT clauses, and these clauses were then fed into the MT engine in order. The errors in the MT output are categorized into three classes: incorrect lexical choices, structural errors, and component omissions. Structural errors are further divided into SV-structure errors and non-SV-structure errors. The data show that, first, the major errors are structural errors, among which non-SV-structure errors account for the larger proportion; and second, translation errors decrease significantly after English sentences are parsed into NT clauses. This result indicates that non-SV clauses are the main source of MT errors and suggests that long English sentences should be parsed into NT clauses before they are translated.
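The error taxonomy described in the abstract can be illustrated with a small tallying sketch. This is only an assumption about how such annotations might be counted, not the authors' procedure: the names `ErrorType`, `error_distribution`, and `structural_share` are hypothetical, and the sample labels are made up for illustration rather than taken from the paper.

```python
from collections import Counter
from enum import Enum


class ErrorType(Enum):
    """Error classes named in the abstract (structural errors split in two)."""
    LEXICAL_CHOICE = "incorrect lexical choice"
    SV_STRUCTURE = "SV-structure error"
    NON_SV_STRUCTURE = "non-SV-structure error"
    OMISSION = "component omission"


# The two subclasses that together make up "structural errors".
STRUCTURAL = {ErrorType.SV_STRUCTURE, ErrorType.NON_SV_STRUCTURE}


def error_distribution(labels):
    """Return each error type's share of all annotated errors."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in ErrorType}
    return {t: counts[t] / total for t in ErrorType}


def structural_share(labels):
    """Return the combined share of SV and non-SV structural errors."""
    dist = error_distribution(labels)
    return sum(dist[t] for t in STRUCTURAL)


# Hypothetical annotations, invented for this sketch only.
whole_sentence_errors = [
    ErrorType.NON_SV_STRUCTURE, ErrorType.NON_SV_STRUCTURE,
    ErrorType.SV_STRUCTURE, ErrorType.LEXICAL_CHOICE,
]
nt_clause_errors = [ErrorType.LEXICAL_CHOICE, ErrorType.NON_SV_STRUCTURE]

if __name__ == "__main__":
    for name, labels in [("whole sentences", whole_sentence_errors),
                         ("NT clauses", nt_clause_errors)]:
        print(name, len(labels), round(structural_share(labels), 2))
```

Keeping the two input conditions as separate label lists makes it straightforward to compare both the total error count and the structural share between whole-sentence input and NT-clause input.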

Similar resources

Improving Arabic-Chinese Statistical Machine Translation using English as Pivot Language

We present a comparison of two approaches for Arabic-Chinese machine translation using English as a pivot language: sentence pivoting and phrase-table pivoting. Our results show that using English as a pivot in either approach outperforms direct translation from Arabic to Chinese. Our best result is the phrase-pivot system which scores higher than direct translation by 1.1 BLEU points. An error...

The NICT/ATR speech translation system for IWSLT 2008

This paper describes the National Institute of Information and Communications Technology/Advanced Telecommunications Research Institute International (NICT/ATR) statistical machine translation (SMT) system used for the IWSLT 2008 evaluation campaign. We participated in the Chinese–English (Challenge Task), English–Chinese (Challenge Task), Chinese–English (BTEC Task), Chinese–Spanish (BTEC Tas...

A Comparative Study of English-Persian Translation of Neural Google Translation

Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...

機器翻譯為本的中文拼字改錯系統 (Chinese Spelling Checker Based on Statistical Machine Translation)

Chinese spelling check is an important component for many NLP applications, including word processor and search engines. However, compared to checkers for alphabetical languages (e.g., English or French), Chinese spelling checkers are more difficult to develop, because there are no word boundaries in Chinese writing system, and errors may be caused by various Chinese input methods. In this pape...

Word Graphs for Statistical Machine Translation

Word graphs have various applications in the field of machine translation. Therefore it is important for machine translation systems to produce compact word graphs of high quality. We will describe the generation of word graphs for state-of-the-art phrase-based statistical machine translation. We will use these word graphs to provide an analysis of the search process. We will evaluate the qualit...

Publication date: 2016